fix: preserve file parts in subtask prompts for multimodal subagents #20021

Closed
cyberprophet wants to merge 2 commits into anomalyco:dev from cyberprophet:fix/subtask-preserve-file-parts

Conversation

@cyberprophet (Contributor) commented Mar 30, 2026

Issue for this PR

Closes #20001

Type of change

  • Bug fix
  • New feature
  • Refactor / code improvement
  • Documentation

What does this PR do?

When isSubtask is true in prompt.ts, the prompt assembly discards all non-text parts from input.parts. This means file content blocks (images, PDFs) passed by callers like the look_at tool are silently dropped before the subagent ever sees them.

The fix adds one line: filter input.parts for type === "file" entries and append them after the subtask part. The existing toModelMessages in message-v2.ts already handles file parts in user messages (lines 655-668), so no other changes are needed.
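
A minimal sketch of the filtering described above, not the actual code in prompt.ts. The part shapes, the field names (`mime`, `url`, `agent`, `prompt`), and the helper `assembleSubtaskParts` are hypothetical and chosen only for illustration; the point is that entries with `type === "file"` survive and land after the subtask part.

```typescript
// Hypothetical, simplified part shapes; the real types in prompt.ts are richer.
type TextPart = { type: "text"; text: string }
type FilePart = { type: "file"; mime: string; url: string }
type SubtaskPart = { type: "subtask"; agent: string; prompt: string }
type InputPart = TextPart | FilePart
type OutputPart = InputPart | SubtaskPart

// Hypothetical helper: builds the parts of a subtask message.
function assembleSubtaskParts(subtask: SubtaskPart, inputParts: InputPart[]): OutputPart[] {
  // Keep only file parts (images, PDFs) so a multimodal subagent still receives them.
  const fileParts = inputParts.filter((part): part is FilePart => part.type === "file")
  // The subtask part comes first; the preserved file parts are appended after it.
  return [subtask, ...fileParts]
}

// Illustrative usage: the text part is dropped, the image part is kept.
const parts = assembleSubtaskParts(
  { type: "subtask", agent: "multimodal-looker", prompt: "Describe the product image" },
  [
    { type: "text", text: "Describe the product image" },
    { type: "file", mime: "image/png", url: "data:image/png;base64,AAAA" },
  ],
)
console.log(parts.map((part) => part.type).join(","))
```

Appending rather than prepending keeps the subtask part in the leading position, so any code that expects it first would be unaffected under this sketch's assumptions.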

I hit this while building a plugin that delegates product image analysis to multimodal-looker. The agent always responded with "Could not access the image file" because the image content block was stripped. Traced it to the isSubtask gate at prompt.ts:1927-1942. The existing TODO comment on line 1938 acknowledges this gap.

How did you verify your code works?

  • All 146 session tests pass (bun test test/session/)
  • TypeScript typecheck passes across all 13 packages (bun turbo typecheck)
  • Manual test: called look_at with a product image → multimodal-looker received the image and returned structured analysis

Screenshots / recordings

N/A — not a UI change.

Checklist

  • I have tested my changes locally
  • I have not included unrelated changes in this PR

When `isSubtask` is true, the prompt assembly logic discards all
non-text parts from `input.parts`, including images and PDFs passed
as content blocks. This makes multimodal subagents like
multimodal-looker unable to receive any visual content — the agent
gets only the text instruction with no image data attached.

Preserve `input.parts` entries with `type === "file"` alongside the
subtask part so that vision-capable subagents can analyze images
passed through tools like `look_at`.

Closes anomalyco#20001

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
github-actions bot added the contributor and needs:compliance labels and removed the needs:compliance label on Mar 30, 2026
@github-actions (Contributor)

Thanks for updating your PR! It now meets our contributing guidelines. 👍

@rekram1-node (Collaborator)

Automated PR Cleanup

Thank you for contributing to opencode.

Due to the high volume of PRs from users and AI agents, we periodically close older PRs using automated criteria so maintainers can focus review time on the most active and community-supported contributions.

This PR was closed because it matched the following cleanup criteria:

  • The PR was created more than 1 month ago
  • The PR had fewer than 2 positive reactions
  • Positive reactions are counted as thumbs-up, heart, celebration, or rocket reactions on the PR

PRs created within the last month are not affected by this cleanup.

If you believe this PR was closed incorrectly, or if you are still actively working on it, please leave a comment explaining why it should be reopened. A maintainer can review and reopen it if appropriate.

Thanks again for taking the time to contribute.

Development

Successfully merging this pull request may close these issues.

Question: Can plugins access image bytes or local temp file paths for multimodal analysis?
